Provable Tensor Methods for Learning Mixtures of Classifiers

Authors

  • Hanie Sedghi
  • Anima Anandkumar
Abstract

We consider the problem of learning associative mixtures for classification and regression problems, where the output is modeled as a mixture of conditional distributions, conditioned on the input. In contrast to approaches such as expectation maximization (EM) or variational Bayes, which can get stuck in bad local optima, we present a tensor decomposition method which is guaranteed to correctly recover the parameters. The key insight is to learn score function features of the input, and employ them in a moment-based approach for learning associative mixtures. Specifically, we construct the cross-moment tensor between the label and higher order score functions of the input. We establish that the decomposition of this tensor consistently recovers the components of the associative mixture under some simple nondegeneracy assumptions. Thus, we establish that feature learning is the critical ingredient for consistent estimation of associative mixtures using tensor decomposition approaches.
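The pipeline described in the abstract — form the empirical cross-moment between the label and a higher-order score function of the input, then decompose that tensor — can be illustrated end to end. Below is a minimal sketch under assumptions the abstract does not fix: standard Gaussian input x ~ N(0, I) (for which the order-3 score function has the closed form S₃(x) = x⊗x⊗x − sym(E[y x] ⊗ I after taking cross-moments, via Stein's identity)) and a toy mixture of tanh-link regressions. All names (`score_cross_moment`, `tensor_power_iteration`) and the link function are illustrative choices, not the authors' implementation.

```python
# Sketch: cross-moment of the label with the 3rd-order Gaussian score function,
# followed by tensor power iteration with deflation. Toy data, not the paper's code.
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 6, 3, 200_000

B = np.linalg.qr(rng.standard_normal((d, k)))[0]   # true components (columns)
w = np.array([0.5, 0.3, 0.2])                      # mixing weights
h = rng.choice(k, size=n, p=w)                     # latent component per sample
X = rng.standard_normal((n, d))                    # x ~ N(0, I_d)
y = np.tanh(np.sum(X * B[:, h].T, axis=1)) + 0.01 * rng.standard_normal(n)

def score_cross_moment(X, y):
    """Empirical T = E[y * S3(x)] for standard Gaussian input x."""
    n, d = X.shape
    T = np.einsum('ia,ib,ic->abc', y[:, None] * X, X, X, optimize=True) / n
    m1 = (y[:, None] * X).mean(axis=0)             # E[y x]
    I = np.eye(d)
    # Stein correction: subtract the symmetrized E[y x] (x) I term.
    T -= (np.einsum('a,bc->abc', m1, I)
          + np.einsum('b,ac->abc', m1, I)
          + np.einsum('c,ab->abc', m1, I))
    return T

def tensor_power_iteration(T, k, n_iter=100):
    """Extract k components of a (noisy) symmetric rank-k tensor by deflation."""
    T = T.copy()
    comps = []
    for _ in range(k):
        v = rng.standard_normal(T.shape[0])
        for _ in range(n_iter):
            v = np.einsum('abc,b,c->a', T, v, v)
            v /= np.linalg.norm(v)
        lam = np.einsum('abc,a,b,c->', T, v, v, v)
        T = T - lam * np.einsum('a,b,c->abc', v, v, v)   # deflate
        comps.append(v)
    return np.array(comps)

est = tensor_power_iteration(score_cross_moment(X, y), k)
# |est @ B| should be close to a permutation matrix: components are
# recovered only up to sign and ordering, and up to sampling error.
print(np.round(np.abs(est @ B), 2))
```

With this toy model the cross-moment tensor is approximately Σ_h w_h · E[tanh′′′(⟨β_h, x⟩)] · β_h^⊗3, so the nondegeneracy assumption mentioned in the abstract amounts to these coefficients being nonzero and the components being suitably separated.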


Similar resources

Learning Mixtures of Linear Classifiers

We consider a discriminative learning (regression) problem, whereby the regression function is a convex combination of k linear classifiers. Existing approaches are based on the EM algorithm or similar techniques, without provable guarantees. We develop a simple method based on spectral techniques and a ‘mirroring’ trick that discovers the subspace spanned by the classifiers’ parameter vector...

Full text

Provable Tensor Methods for Learning Mixtures of Generalized Linear Models

which is a multilinear combination of the tensor mode-1 fibers. Similarly, T(u, v, w) ∈ ℝ is a multilinear combination of the tensor entries, and T(I, I, w) ∈ ℝ^{d×d} is a linear combination of the tensor slices. Now, let us proceed with the proof. Proof: Let x′ := ⟨u, x⟩ + b. Define l(x) := y · x ⊗ x. We have E[y · x^{⊗3}] = E[l(x) ⊗ x] = E[∇ₓ l(x)], ...

Full text
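The excerpt above compresses a Stein-type integration-by-parts step: the first equality, E[y · x^{⊗3}] = E[l(x) ⊗ x], is just the definition of l(x), and the second, E[l(x) ⊗ x] = E[∇ₓ l(x)], holds for Gaussian x by Stein's identity. A quick Monte Carlo sanity check of that second equality, with an assumed toy link y = tanh(⟨β, x⟩) (not taken from the paper):

```python
# Check E[l(x) (x) x] = E[grad_x l(x)] for l(x) = y * x (x) x and x ~ N(0, I).
import numpy as np

rng = np.random.default_rng(1)
d, n = 5, 200_000
beta = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = np.tanh(X @ beta)

# LHS: E[l(x) (x) x] = E[y * x (x) x (x) x].
lhs = np.einsum('ia,ib,ic->abc', y[:, None] * X, X, X, optimize=True) / n

# RHS: E[d(y x_a x_b)/dx_c]
#    = E[(1 - y^2) beta_c x_a x_b] + E[y (delta_ac x_b + x_a delta_bc)].
gy = (1 - y**2)[:, None] * beta                # per-sample gradient of y w.r.t. x
rhs = np.einsum('ia,ib,ic->abc', X, X, gy, optimize=True) / n
yx = (y[:, None] * X).mean(axis=0)             # E[y x]
I = np.eye(d)
rhs += np.einsum('ac,b->abc', I, yx) + np.einsum('a,bc->abc', yx, I)

print(np.abs(lhs - rhs).max())                 # ~1e-2: zero up to sampling error
```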

Provable Learning of Overcomplete Latent Variable Models: Semi-supervised and Unsupervised Settings

We provide guarantees for learning latent variable models, with an emphasis on the overcomplete regime, where the dimensionality of the latent space can exceed the observed dimensionality. In particular, we consider spherical Gaussian mixtures and multi-view mixture models. Our algorithm is based on the method of moments and employs a tensor decomposition method for learning. In the semi-supervised sett...

Full text

Nonparametric Estimation of Multi-View Latent Variable Models

Spectral methods have greatly advanced the estimation of latent variable models, generating a sequence of novel and efficient algorithms with strong theoretical guarantees. However, current spectral algorithms are largely restricted to mixtures of discrete or Gaussian distributions. In this paper, we propose a kernel method for learning multi-view latent variable models, allowing each mixture c...

Full text

Higher order Matching Pursuit for Low Rank Tensor Learning

Low rank tensor learning, such as tensor completion and multilinear multitask learning, has received much attention in recent years. In this paper, we propose higher order matching pursuit for low rank tensor learning problems with a convex or a nonconvex cost function, which generalizes matching pursuit type methods. At each iteration, the main cost of the proposed methods is on...

Full text
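The truncated snippet above names the algorithmic pattern: at each iteration, fit one rank-one term to the current residual and update. Below is a generic rank-one tensor matching pursuit for a least-squares objective, a sketch of that pattern rather than the paper's exact method (its step-size rules, cost functions, and analysis are not reproduced here):

```python
# Generic rank-one tensor matching pursuit: greedily peel off rank-1 terms
# from the residual of a least-squares tensor approximation.
import numpy as np

rng = np.random.default_rng(2)

def rank1_fit(R, n_iter=50):
    """ALS-style search for the rank-1 term u (x) v (x) w best aligned with R."""
    u, v, w = (rng.standard_normal(s) for s in R.shape)
    for _ in range(n_iter):
        u = np.einsum('abc,b,c->a', R, v, w); u /= np.linalg.norm(u)
        v = np.einsum('abc,a,c->b', R, u, w); v /= np.linalg.norm(v)
        w = np.einsum('abc,a,b->c', R, u, v); w /= np.linalg.norm(w)
    lam = np.einsum('abc,a,b,c->', R, u, v, w)   # matching-pursuit coefficient
    return lam, u, v, w

def tensor_matching_pursuit(T, n_terms):
    R = T.copy()                                 # residual
    approx = np.zeros_like(T)
    for t in range(n_terms):
        lam, u, v, w = rank1_fit(R)
        term = lam * np.einsum('a,b,c->abc', u, v, w)
        approx += term                           # greedy update; published
        R -= term                                # variants refine the step size
        print(f'term {t}: residual norm {np.linalg.norm(R):.4f}')
    return approx

# Toy target: a rank-3 tensor plus a little noise.
T = sum(np.einsum('a,b,c->abc', *rng.standard_normal((3, 10))) for _ in range(3))
T += 0.01 * rng.standard_normal(T.shape)
tensor_matching_pursuit(T, 5)
```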


Journal:
  • CoRR

Volume: abs/1412.3046
Issue: -
Pages: -
Publication date: 2014